Invention Disclosure Form Supplement

1. NAMES 

Christoph A. Aktas, John W. Yates, Phillip C. Meredith

2. TITLE 

A subsystem for device and media sensitive access and processing of multimedia
messages

3. PURPOSE AND PROBABLE FIELD OF USE OF THE INVENTION 

The purpose of this invention is to provide a subscriber or a user of a unified messaging mailbox
with efficient, intelligent, media- and device-sensitive methods to access and process (e.g., read,
listen, forward, and search) messages. The invention introduces information summarization and
media conversion capabilities to selectively treat multimedia messages and message
attachments so that they can be efficiently handled by mobile devices like PDAs (Personal Digital
Assistants), pagers, or phone devices (with or without a text display feature). Furthermore, the
invention introduces message content analysis capabilities that will recognize linguistic
relationships between messages regardless of the media type. The invention also describes the
ability to present these linguistic relationships as a "graph" along with the standard messaging
relationships (message arrival time, subject, sender, etc.). And, finally, the invention also
introduces a message referencing option that allows simpler message selection from certain
devices.



4. PLANNED USE IN PRODUCTS 

The system could potentially be integrated in all our messaging, mobility, and collaboration
related products, including Xpressions. Specific clients could be developed as separate products
and work with any existing messaging system. The capabilities could also be integrated into call
center-based e-mail communication systems for interaction with customers. New virtual agent
products could utilize these capabilities.

5. ABSTRACT 

This invention describes capabilities that will, especially when used in coordination with each
other and existing capabilities, increase the efficiency of unified mailbox users. Media
conversion, information summarization, sophisticated data relationship identification, data
relationship presentation and message selection tools are described that allow the unified mailbox
user organized and efficient access to his/her messages, customized to the device they are using.
With these capabilities, the problems of overwhelming amounts of data in a variety of media
(often incompatible with the user's current display device) will be reduced.



6. BACKGROUND INFORMATION 

6.1 What is the problem solved by your invention?



Date: 
A unified mailbox where all kinds of media (voice, fax, e-mail, and video) are made accessible
and/or visible from virtually anywhere to a subscriber or user in one basket is a convenient means
of communication when compared to handling multiple mailboxes with distinct media. Current
solutions for a unified mailbox are inefficient, however, for someone with an intense
communication style and a frequent need to handle his/her messages remotely. The mismatch
between the media type of the information and the capabilities of the various (often limited) devices used for
remote access places a heavy burden on the user and the interface of the system. This is
especially true for the interfaces utilizing a telephone with no display, or handheld devices with
limited display capabilities.

Some of the problems arise in the context of compound and/or lengthy messages in connection
with one or the other access means. For example:

• How to handle voice and fax attachments from a text-only e-mail capable device?

• How to treat lengthy e-mails from a voice-only interface or a text interface with limited
capacities? Even when the device has a fully functional GUI interface, there is room for
increased efficiency with large amounts of data.

• How to efficiently present the information in various office document formats (e.g., Word
Processor, Spreadsheet, and Presentations) associated with a message?

• How to locate and visually present related messages and attachments?

• How to easily reference messages in the message store?

Other problems arise due to the increased amount of information the unified mailbox can provide.
Current mechanisms for organizing and presenting relationships among messages (listing by
arrival time, subject, sender, etc.) are insufficient for a large number of messages of varying
media (and, especially, mixed media within a given message). The user requires a flexible,
media-independent way of finding and navigating related messages. With current systems, for
example, the user is unable to recognize that there is a relationship between a voice message
and a fax without listening to the message and displaying/printing the fax. This invention
proposes that the system convert these messages to text, analyze the results and present any
relationships discovered.

Finally, because the presentation of unified mailbox information is more complex, especially if
relationships as described in the previous paragraph are incorporated into the presentation,
identifying an individual item (message or message attachment) for further action can become
problematic. How does the client/user identify to the server which message is to be acted upon?
Are the entire message and its attachments to be involved? Is it a single attachment or only the
original message body? And if the messages are presented in a "graph" format, how does the
user select an individual item? This invention also addresses this problem.



6.2 What techniques prior to your invention were used to perform the function of your
invention?

Current technology that attempts to address these problems varies depending on the user's 
device: 

1. Current systems offer media sensitivity for message retrieval when provided with a graphical
user interface (GUI) from a PC client or Web. If a particular media or office document is
attached to an e-mail, the user needs to click on it in order to launch a specific client (for
instance: audio player for voice, TIFF viewer for fax, video player to view a video message,
Word to view or print an MS-Word document) to listen to, view, or print the document.

For users with intense communication requirements (e.g., executives or customer service
agents who receive hundreds of compound messages daily) there are no means to quickly
process/read inbox messages except by the sender information, the subject line, and maybe
a few lines of the message body. In order to read messages, the user has to click on or mark a
certain item in a graphical interface in order to get to the message body.

No content summarization of lengthy text messages or respective attachments is available 
yet that would remarkably improve the efficiency of handling the daily information avalanche 
in the office.

Search is provided in certain clients on e-mail only, but it does not provide visual display of
content and temporal relationships. No search capability exists yet for non-text messages. 

No powerful message referencing capabilities are offered to efficiently handle multimedia 
messages from remote devices (e.g., telephone).

2. If a unified mailbox is accessed from a telephone interface, voice and e-mail messages are 
retrievable and the user can listen to both; here, text-to-speech technology provides a means 
to convert the e-mail to voice. A fax message can be forwarded to a fax machine or printer. 
However, if an e-mail contains a voice attachment, the systems are able to indicate that, but
are unable to access its content. Similarly, the contents of a fax or other documents attached
to an e-mail are indicated but not accessible to the user using the telephone interface.

If an e-mail is lengthy, based on the interface provided, the user may be able to navigate through
it by accelerating its reading, skipping parts, etc. in order to listen to it completely. There is no
means of text content summarization applied to shorten the process without eventually
losing/skipping critical content.

3. If messages are forwarded to a handheld device via a wireless service but the device has
limited text-display capabilities, only certain parts of the e-mail (From, Subject and a limited
number of characters of the message body) can be displayed.

If the critical information in the message is not in the beginning of the message body that is
displayed, it is "lost" to the recipient. He/she has to use other access methods (like Web) or
make a call into the messaging system/server to retrieve the full text message (by listening to
it or by initiating a printing to a fax device nearby).

Importantly, voice and other media attachments are indicated but not transmitted and/or
displayed on a text-only display. The user needs to use other access methods (Web and/or
Telephone) to retrieve the messages. Additionally, no text content summarization methods
are utilized to deal with access device technology limitations.

6.3 What are the disadvantages of these prior techniques?

Full message sensitivity is provided within a GUI only (desktop PC client or Web). However, even
GUI interfaces lack any means to summarize message content in order to make it more efficient
for the recipient to read his/her lengthy messages. Also, there are as yet no means to summarize
the content of attached documents.

Within the telephony interface, the media and device sensitivity is limited to voice and e-mail.
Again, no techniques of text content summarization are applied yet in order to make the retrieval
of the message information over the phone more convenient.

In the case of handheld or mobile devices with limited text-display capabilities, the problem is that
lengthy messages are usually not transmitted by the wireless/paging service providers.
Additionally, any other media attachments are "lost".

No content summarization of lengthy text messages or respective attachments is available yet 
that would remarkably improve the efficiency of handling the daily information avalanche in the 
office. 

Search is provided in certain clients on e-mail only, but it does not provide visual display of
content and temporal relationships. No search capability exists yet for non-text messages. 

No powerful message referencing capabilities are offered to efficiently handle multimedia 
messages from remote devices (e.g., telephone).



6.4 What are the advantages of your invention over the prior techniques? 
The proposed invention solves the problems above by utilizing advanced media conversion 
methods, analysis and summarization of message content and intelligent forwarding concepts. 
Principally it provides access device and media sensitive intelligence for a mailbox when 
retrieving or forwarding a particular message. 

The proposed techniques provide the following advantages: 

a) Advanced media conversion - the concept of media conversion is extended beyond
text-to-speech to other attachments; a speaker-independent, large-vocabulary, telephony-quality
speech recognition engine is utilized to convert a voice message to text or to convert the
voice track of a video attachment into readable text. Similarly, fax information is converted
into text. Use of this feature will allow devices unable to handle the original media
type of a message to present the message to the user.

b) Content summarization provides increased efficiency in message handling - the
summarization of a message's content is an improvement toward efficiency, even in a GUI
environment. In a GUI environment, let's say in Microsoft Outlook, the user could point to the
item in the in-box to have the summary of that message displayed in a small pop-up box. But its
benefits are most obvious in the case of a lengthy message forwarded to a handheld device with
limited display capabilities. The same is true for reading a message over the phone.
Summarization applied to attached media (e.g., fax, Word document) extends the media
content accessible even further.

c) Media and device sensitivity - Both the media conversion and the content summarization
applied together provide compatibility with the access device. Depending on the user, the
types of potential access devices are usually predefined; therefore messages, along with the
attachments that form the message content, can be tailored to those devices while accessed
or forwarded according to a profile. This ensures the availability of more information to the
recipient at the device of choice, which is probably the most convenient. Still, if the user
requires more information, he/she can utilize another access method (telephone or Web).

d) Cross-Media Search and Visual Display - Often messages related to a specific topic of
interest to the user are in different media and spread throughout the message store (e.g.,
different folders). The cross-media search would find these messages and present them to
the user in a way that makes the content and time relationships clear - allowing efficient use
of the otherwise overwhelming amount of information. Ultimately, the search can utilize
sophisticated linguistically based analysis tools to discover the message relationships.

e) Simpler Message Referencing - Additionally, a reference number scheme for messages is
proposed. All messages in a particular group of messages of interest to the user would be
assigned a reference number to be used in further actions. Thus a PDA user can, for
example, get a summary of messages with reference numbers and an indication of the
message type. This reference number could then be used to access that message, and
through it a particular attachment of that message for further action (Print the first attachment
of message 7 to fax; play message 6 on telephone). Ultimately, voice commands could be
used to invoke actions on items more efficiently ("Fax item 12345 to my hotel").

f) Combinations of the above features - Powerful user interface functions can be built from
these features. The sample scenario below shows one possibility. Many others are possible.
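
The reference number scheme of item e) could be sketched roughly as follows. This is only an illustration under stated assumptions: the names Message and ReferenceTable, the 1-based attachment numbering, and the listing layout are hypothetical and are not defined anywhere in this disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    media: str                                   # e.g. "voice", "fax", "e-mail"
    attachments: list = field(default_factory=list)


class ReferenceTable:
    """Assigns short reference numbers to a group of messages (hypothetical)."""

    def __init__(self, messages, start=1):
        self._by_ref = {start + i: m for i, m in enumerate(messages)}

    def listing(self):
        # One line per message: reference number, message type, sender.
        return [f"{ref} [{m.media}] {m.sender}"
                for ref, m in sorted(self._by_ref.items())]

    def resolve(self, ref, attachment=None):
        """Return the message, or its n-th attachment (1-based) if requested."""
        msg = self._by_ref[ref]
        if attachment is None:
            return msg
        if attachment < 1:
            raise IndexError("attachment numbers start at 1")
        return msg.attachments[attachment - 1]


msgs = [Message("Ann", "voice"),
        Message("Bob", "e-mail", ["report.doc", "chart.tif"])]
table = ReferenceTable(msgs)
print("\n".join(table.listing()))
print(table.resolve(2, attachment=1))   # "first attachment of message 2"
```

A command such as "Print the first attachment of message 7 to fax" would then reduce to one lookup by reference number plus an action on the returned item.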



7. DETAILED DESCRIPTION 



7.1 Detailed structural and functional operation

The basic structural components required for the proposed features are: 



• A multimedia PC with a client program that provides a multimedia message inbox.
Alternatively, a server that provides a multimedia message inbox for several users on a
network.

• A subsystem that detects media attachments in messages in a mailbox.

• A subsystem that converts media attachments into another media type using text-to-speech,
fax-to-text, video voice track to text, and ultimately speech-to-text.

• A subsystem that analyzes and summarizes the text content of original or converted media with
respect to its linguistic meaning.

• A subsystem that pushes appropriate media according to the access device and message
purpose, as defined in a profile.

• A subsystem that identifies cross-media interrelationships between messages and controls 
the media conversions necessary for this analysis. 

• A subsystem that controls the reference scheme. 
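
As an illustration of the cross-media relationship subsystem listed above, the sketch below links messages whose text (original, or converted from voice or fax) shares enough keywords. The disclosure calls for linguistically based analysis and does not specify a method; plain keyword overlap (Jaccard similarity) is substituted here as a simple stand-in, and the stop-word list, the threshold, and all names are assumptions.

```python
import re
from itertools import combinations

# Small illustrative stop-word list (assumption, not from the disclosure).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "in", "for", "on",
             "please", "i", "you", "we"}


def keywords(text):
    """Lower-cased words of the text, minus stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def related_pairs(converted, threshold=0.3):
    """converted maps a message id to its text (original or converted).

    Returns graph edges (id_a, id_b, score) for pairs whose keyword
    overlap reaches the threshold.
    """
    edges = []
    for (id_a, text_a), (id_b, text_b) in combinations(sorted(converted.items()), 2):
        ka, kb = keywords(text_a), keywords(text_b)
        union = ka | kb
        score = len(ka & kb) / len(union) if union else 0.0
        if score >= threshold:
            edges.append((id_a, id_b, round(score, 2)))
    return edges


converted = {
    "voice-1": "Please review the Q3 budget draft",      # speech-to-text output
    "fax-2": "Q3 budget draft attached for review",      # fax-to-text output
    "email-3": "Lunch on Friday?",
}
print(related_pairs(converted))
```

The resulting edges are the raw material for the "graph" presentation; the voice message and the fax are linked without the user having to listen to one or print the other.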



Sample Scenario: Notification of a Single-Media Voice Message to a Data Pager

The following describes an example of this process involving a user who has a multimedia
mailbox and a data pager and who receives a Voice Message. The problem is to provide the "best"
information to the pager so the user can proceed most efficiently. What the "best" information is
will vary according to the user's actual preferences, but it will most likely include sender
identification and meaningful portions of the message itself. In addition, there are probably
messages the user would prefer to delay handling until an appropriate device is available.
Thus the steps would include: a) filtering messages to be processed, b) speech-to-text
conversion, c) summarization and post-filtering, and d) selection and delivery of this information to
the device.



Pre-Filtering

The resources involved in processing a message may be large: Speech-to-Text in particular is "expensive"
in its use of resources. In addition, interrupting the user with all but the most important messages can be an
unnecessary expense of the user's time and attention. Thus a mechanism to prevent the
presentation of a message to a given device is important. This filtering is currently available in
products like Microsoft Exchange/Outlook that can filter based on a variety of data including
sender, message priority, etc.
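
A pre-filter of this kind could look roughly like the sketch below. The rule structure is hypothetical: the names FilterRule and passes_prefilter, and the convention that a higher priority number means a more urgent message, are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterRule:
    allowed_senders: frozenset   # an empty set means "any sender"
    min_priority: int            # assumption: higher number = more urgent


def passes_prefilter(sender, priority, rule):
    """True if the message may go on to the expensive conversion step."""
    sender_ok = not rule.allowed_senders or sender in rule.allowed_senders
    return sender_ok and priority >= rule.min_priority


pager_rule = FilterRule(frozenset({"boss@example.com"}), min_priority=2)
print(passes_prefilter("boss@example.com", 3, pager_rule))
print(passes_prefilter("newsletter@example.com", 3, pager_rule))
```

Messages that fail the check are simply left in the mailbox, so no speech-to-text resources are spent on them.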

Speech-To-Text Conversion

With a speaker/microphone independent speech-to-text engine, the selected Voice Message
can then be converted to text[1]. This would be most efficiently accomplished on the server side,
perhaps with a dedicated 'helper' server explicitly for the server so as not to disturb other
processing on the server. The resulting text message would then be associated with the original
message (as the text message body or as a separate attachment).

Post-Conversion Filtering and Summarization 

Another filtering step might then be appropriate, preventing processing of messages that appear
not to be on a topic deemed important to the user. If a message does not appear important, it would then
remain in the mailbox to be processed.

If the message survives the filtering step, the text would then be summarized. Most simply,
summarization would include reduction to a list of keywords and phrases found within the text.
The summary would be created by removing from the message words/phrases not found
within the user-defined list of keywords/phrases. More complex summarization would include
allowing the user to specify the keyword/phrase list based on the sender of the message[2]. The
most complex summarization method would involve sophisticated grammatical parsing and
analysis.
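
The simplest summarization method described above, together with the per-sender keyword lists, might be sketched as follows. The function name, its parameters, and the choice to emit the matches in lower case are assumptions; homonym handling and grammatical parsing are not attempted.

```python
import re


def summarize(text, keywords, per_sender=None, sender=None):
    """Keep only the user-defined keywords/phrases, in order of appearance.

    per_sender optionally maps a sender to a keyword list that replaces
    the default list for messages from that sender.
    """
    active = list(keywords)
    if per_sender and sender in per_sender:
        active = list(per_sender[sender])
    if not active:
        return ""
    # Longest phrases first, so "budget meeting" wins over "budget".
    active.sort(key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in active) + r")\b",
        re.IGNORECASE,
    )
    return " ".join(m.group(0).lower() for m in pattern.finditer(text))


text = "Hi, the budget meeting moved to Friday; call me about the contract."
print(summarize(text, ["budget meeting", "friday", "contract"]))
print(summarize(text, ["friday"], per_sender={"ann": ["call"]}, sender="ann"))
```

Everything outside the keyword list is dropped, which is exactly the "removal" form of summarization; the result is short enough for a pager display but carries no grammar.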

Data Selection 

For notification via data pager, the user would configure what part of the filtered, summarized
message they wish to be sent. The data available for selection would include Sender Name,
Time, Summary, Message Priority and un-summarized Text (and other fields as available). The
user would describe a template that indicates the information desired and the number of
characters of each field desired. For example[3]:

From %SENDER% at %TIME%: %100SUMMARY%

would indicate the user wants a string that includes the entire sender name, the received time
and the first 100 characters of the summary to appear on his pager.
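
Template expansion of this kind could be sketched as below. Since the exact template format is still to be determined, the syntax used here is an assumption: a field is written %NAME%, and an optional leading number, as in %100SUMMARY%, limits the field to that many characters. The function name render is likewise hypothetical.

```python
import re

# %NAME% or %<n>NAME%, where <n> is an optional character limit (assumed syntax).
_FIELD = re.compile(r"%(\d*)([A-Z_]+)%")


def render(template, fields):
    """Replace each field marker with its value, truncated if a limit is given."""
    def expand(match):
        limit, name = match.group(1), match.group(2)
        value = str(fields.get(name, ""))   # unknown fields expand to nothing
        return value[: int(limit)] if limit else value
    return _FIELD.sub(expand, template)


page = render(
    "From %SENDER% at %TIME%: %10SUMMARY%",
    {"SENDER": "John Yates", "TIME": "09:15", "SUMMARY": "budget meeting friday"},
)
print(page)
```

Per-field limits let the user decide how a small display is divided among the fields, rather than having the service truncate the whole message at an arbitrary point.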

When the user receives the page, the summary information should give him/her enough 
information to determine how critical the message is. If it appears critical, he/she could get to a 
more appropriate device (e.g., a telephone) and listen to the full message.



[1] Interim solutions for non-speaker/microphone independent engines will be discussed in a separate
document (maybe in section 7.2).

[2] Since the message will be a speech-to-text conversion, the keywords AND THEIR HOMONYMS should
be checked. An option on the summarization, like a check box that says "Allow homonyms", would seem
the best way to handle this.

[3] The exact template format is to be determined; this is just to describe the concept.



7.2 Are there alternative methods or different structural embodiments of your invention? Can
the general idea or technique of your invention be extended to other related fields?

a) The summarization concept can be applied in the context of any other handheld device that is not
limited to text only (e.g., Windows and Windows/CE devices).

b) A morphing process for a certain message in the context of any particular target device can be
defined. The morphing process would be a combination of message filtering, message
restructuring, data conversion, data summarization, data selection and notification steps that
are configured to handle particular media types for particular target devices. Each user could
define a set of rules and parameters for each device type that would determine how the
message is morphed. For example, a user could have a Voice Message-to-Pager morph
definition that would do the following:

1. Filter based on sender and priority, removing from further processing (i.e., leaving on the
server) messages that are not deemed urgent enough to disturb the user while out of the
office.

2. The message would go through speech-to-text conversion and the resulting text stored back 
in the message. 

3. The text would be summarized based on criteria defined by the user. 

4. Another filtering step, based on the summarized/converted text, would then weed out items
whose content makes them less urgent.

5. Data fields from the message (like sender, priority, summary and text) are selected and a 
notification message created. 

6. The message is sent to the pager as a notification. 

In general, a morphing process will include these steps in some order determined by the user. In 
addition, message restructuring steps would allow the user to handle multiple attachments of 
varying media on the message. For example, the user could select that a summary of the 
attachments be created (attachment name and media type) - or could request that the 
attachments be expanded, converted and summarized as described for the single media 
message above. 
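
A morph definition can be pictured as an ordered list of steps, each of which either passes the message on or stops processing. The sketch below illustrates the Voice Message-to-Pager steps under stated assumptions: messages are plain dictionaries, the speech recognition engine is replaced by a stand-in function, and every name here is hypothetical.

```python
def run_morph(message, steps):
    """Apply each step in order; a step returns None to stop processing
    (the message then simply stays on the server)."""
    for step in steps:
        message = step(message)
        if message is None:
            return None
    return message


def filter_priority(min_priority):
    # Step 1: drop messages that are not urgent enough (higher = more urgent, assumed).
    return lambda msg: msg if msg["priority"] >= min_priority else None


def speech_to_text(recognize):
    # Step 2: `recognize` stands in for a real speech recognition engine.
    return lambda msg: {**msg, "text": recognize(msg["audio"])}


def summarize(keywords):
    # Step 3: keep only words from the user's keyword set.
    def step(msg):
        words = [w.strip(".,;:!?") for w in msg["text"].lower().split()]
        return {**msg, "summary": " ".join(w for w in words if w in keywords)}
    return step


def drop_empty_summary(msg):
    # Step 4: post-conversion filter; nothing of interest was found.
    return msg if msg["summary"] else None


def build_notification(limit):
    # Steps 5 and 6: select fields and produce the text to send to the pager.
    return lambda msg: {
        **msg,
        "notification": f"From {msg['sender']}: {msg['summary'][:limit]}",
    }


voice_to_pager = [
    filter_priority(2),
    speech_to_text(lambda audio: "Budget meeting moved to Friday."),  # fake engine
    summarize({"budget", "friday"}),
    drop_empty_summary,
    build_notification(limit=20),
]

urgent = {"sender": "Ann", "priority": 3, "audio": b""}
print(run_morph(urgent, voice_to_pager)["notification"])
```

Because the definition is just a list, the user-determined ordering and the other examples (Text Message to Phone, Fax-to-PDA) amount to different lists built from the same kinds of steps.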

Examples 

• Voice Message to Data Pager (as already described)

• Text Message to Phone

1. Filter out unwanted messages based on sender and priority.

2. Summarize the message to reduce its size.

3. Convert text-to-speech.

4. Play the converted message over the phone.

• Fax-to-PDA

1. Filter out unwanted messages based on sender and priority.

2. Convert fax-to-text.

3. Summarize.

4. Select data fields.

5. Send the summary to the PDA and "Notify" the user.

7.3 Which features are believed to be new?

• Conversion of media applied to any media (voice, fax, video) in a mailbox 

• Text summarization of original or converted media from a multimedia mailbox 

• Improved efficiency even in a graphical environment by summarizing the content of a 
message according to its linguistic meaning 

• Media and device sensitivity when accessing multimedia messages in a message inbox 

• Linguistically based search for relationships between message/message attachments of 
differing media types. 

• Integration of these capabilities into useful features for the mailbox user. 